INDIAN ELECTION DATASET

Explore the election data of all the Indian states from 1978 to 2011.

INTRODUCTION

About the Dataset:

This database contains detailed winner candidate‐level data for elections of India’s national and state legislatures. The data span 1977‐2015, with each row representing a candidate that ran for office in that state‐year.

About the Project:

This Exploratory Data Analysis project is part of the "Data Analysis with Python: Zero to Pandas" course structured and provided by Jovian. In this project, we'll analyse the relationships between the features of the Indian election dataset: the distribution of sex among the candidates (the percentage of male, female, and other candidates) and the variation in total votes polled and electors across states. Visualising the data with the matplotlib and seaborn libraries in Python makes the dataset much easier to understand.

Dataset Source:

The dataset is obtained from Kaggle.

Downloading the dataset

This dataset has only one file, 'indian-state-level-election.csv'.

The opendatasets library is used to download the dataset. Let's start by installing this library.

Pasting the Kaggle link of the dataset

Importing the dataset
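The download steps above can be sketched roughly as follows. This is a minimal sketch, not the notebook's original cell: the helper name `download_election_data` is hypothetical, the Kaggle link is taken as a parameter because it is not given in the text, and it assumes the package was installed with `pip install opendatasets`.

```python
def download_election_data(dataset_url, data_dir="."):
    """Download a Kaggle dataset into data_dir using the opendatasets package.

    dataset_url is the Kaggle link of the dataset (not stated in this notebook,
    so it must be supplied by the caller).
    """
    # Imported lazily so that defining this helper does not require the package.
    import opendatasets as od

    # od.download asks for a Kaggle username and API key unless a kaggle.json
    # credentials file is available in the working directory.
    od.download(dataset_url, data_dir=data_dir)
```

Calling `download_election_data(<Kaggle link>)` would then place the dataset folder next to the notebook.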

Giving the name of the file

Importing pandas

Loading the dataset
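Loading works along these lines. To keep the sketch runnable without the download, it reads a few made-up rows from an in-memory string instead of 'indian-state-level-election.csv'. The column names `totvotpoll`, `electors`, `ac_type`, `cand_sex`, `ac_name`, `partyabbre` and `year` appear in this notebook; `st_name` and `ac_no` are assumed names, and all values are placeholders.

```python
import io

import pandas as pd

# Stand-in for the real CSV file: made-up rows, not actual election results.
SAMPLE_CSV = """st_name,year,ac_no,ac_name,ac_type,cand_sex,partyabbre,totvotpoll,electors
Bihar,1980,1,Alpha,GEN,M,AAA,12000,60000
Bihar,1985,2,Beta,SC,F,BBB,15000,70000
Goa,1984,1,Gamma,GEN,M,AAA,4000,12000
Uttar Pradesh,1980,3,Delta,ST,M,CCC,30000,90000
"""

# With the real file, pass its path to pd.read_csv instead of the StringIO object.
ind_elec_state_df = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(ind_elec_state_df.shape)
print(ind_elec_state_df.head())
```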

Data Cleaning and Info

Importing NumPy

Counting all unique values in each column

Counting only states
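Both counts can be obtained with `nunique`. The sketch below uses a small made-up frame, and the state column name `st_name` is an assumption.

```python
import pandas as pd

# Made-up rows for illustration; 'st_name' is an assumed column name.
df = pd.DataFrame({
    "st_name": ["Bihar", "Bihar", "Goa", "Uttar Pradesh"],
    "ac_type": ["GEN", "SC", "GEN", "ST"],
    "cand_sex": ["M", "F", "M", "M"],
})

# Number of distinct values in every column.
unique_per_column = df.nunique()

# Number of distinct states only.
n_states = df["st_name"].nunique()

print(unique_per_column)
print("states:", n_states)
```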

Displaying all the columns and rows

EXPLORATORY DATA ANALYSIS AND VISUALISATION

Graphical Analysis for Visualisation

The scatter plot compares the total votes polled and the number of electors for different assembly constituencies in the Indian election dataset. It visualizes the relationship between the two variables and gives a first impression of how strongly they are correlated. The plot helps analyze the data, but definitive conclusions require additional context and domain knowledge.

In this plot the axes are interchanged; it shows the same relationship between total votes polled and electors.

In this plot, 'hue' and 's' are used to add additional visual information to the same scatter plot.
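A scatter plot of this kind might be drawn as below. This is a sketch on made-up values, not the notebook's original cell. Colouring by `ac_type` is an assumption, since the text does not say which column was passed to `hue`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up rows for illustration only.
df = pd.DataFrame({
    "electors": [60000, 70000, 12000, 90000],
    "totvotpoll": [12000, 15000, 4000, 30000],
    "ac_type": ["GEN", "SC", "GEN", "ST"],
})

fig, ax = plt.subplots(figsize=(8, 5))
# hue colours the points by category; s sets the marker size.
sns.scatterplot(data=df, x="electors", y="totvotpoll", hue="ac_type", s=80, ax=ax)
ax.set_title("Total votes polled vs. electors")
```

Swapping the `x` and `y` arguments gives the version with interchanged axes.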

Changing the default values and plotting a histogram

From the above histogram, it is clear that Uttar Pradesh has the highest total votes polled, which is consistent with UP having the largest population, while Goa, Manipur, Nagaland, and Puducherry have the lowest votes polled.

From the above histogram, it is clear that the General category has the highest number of electors, while BL and SANGH have very few.

In this version, each category's histogram is stacked on top of the others, providing a visual comparison of their distributions.

The above code snippet adjusts the font size in matplotlib and sets the font scale in seaborn for the subsequent plot. It creates a pair plot using sns.pairplot for the 'ind_elec_state_df' DataFrame, with each data point colored by the 'ac_type' variable (hue='ac_type'). A pair plot shows the pairwise relationships in a dataset.

The above code snippet sets the font scale in seaborn and creates a pair plot using sns.pairplot for the 'ind_elec_state_df' DataFrame. The data points are colored by the 'cand_sex' variable (hue='cand_sex'), allowing visual analysis of the pairwise relationships while considering the gender of the candidates.

Changing the scale again

The above code snippet adjusts the font size and figure size using matplotlib settings. It creates a box plot using sns.boxplot to visualize the distribution of 'totvotpoll' (total votes polled) for the different 'ac_type' values (assembly constituency types) in the 'ind_elec_state_df' DataFrame. The box plot is further split by 'cand_sex' (candidate's gender).

The insight from this graph is that the General category gets the highest total votes polled.
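A box plot like the one described, together with a numeric check of the per-category medians, could look like this. The values are made up, so the medians printed here say nothing about the real data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up rows for illustration only.
df = pd.DataFrame({
    "ac_type": ["GEN", "GEN", "GEN", "SC", "SC", "ST"],
    "cand_sex": ["M", "F", "M", "M", "F", "M"],
    "totvotpoll": [12000, 4000, 20000, 15000, 9000, 30000],
})

fig, ax = plt.subplots(figsize=(8, 5))
# One box per ac_type, split further by candidate sex.
sns.boxplot(data=df, x="ac_type", y="totvotpoll", hue="cand_sex", ax=ax)

# The same comparison as numbers: median votes polled per constituency type.
median_by_type = df.groupby("ac_type")["totvotpoll"].median()
print(median_by_type)
```

Reading the medians alongside the plot helps confirm which category is actually highest.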

Changing the datatype of a column
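A datatype change is usually done with `astype`. The notebook does not say which column was converted or to which type, so the sketch below assumes, purely for illustration, that `year` is turned into a categorical column.

```python
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame({"year": [1980, 1985, 1984], "totvotpoll": [12000, 15000, 4000]})

print(df.dtypes)  # 'year' starts out as an integer column

# Assumed conversion: treat the year as a category rather than a number.
df["year"] = df["year"].astype("category")

print(df.dtypes)
```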

Displaying the Full Candidate

From the above scatter plot of total votes polled by year, we can conclude that 2007 has the highest and 1979 the lowest total votes polled.

The above distplot shows the density distribution of total votes polled in each state.

The above rug plot shows the frequency distribution of total votes polled in each state. A rug plot is a plot of data for a single quantitative variable, displayed as marks along an axis. It is used to visualise the distribution of the data. As such, it is analogous to a histogram with zero-width bins, or a one-dimensional scatter plot.

From the above violin plot of electors versus candidate sex, we conclude that male is the highest and third gender is the lowest.

From the above strip plot of the electors distribution with respect to candidate sex, we conclude that male is the highest and third gender is the lowest.

The above graph shows the electors distribution (axis scaled by 1e6) with respect to the acc_no of the candidate, that is, the Vidhan Sabha seat number, counting up to 425.

The provided code uses the Seaborn and Matplotlib libraries to create a bar plot of the number of electors in India over different years.

The provided code uses the Seaborn and Matplotlib libraries to create a box plot of the distribution of the number of electors across different years in India, grouped by candidate sex.

The provided code uses the Matplotlib and Seaborn libraries to create a violin plot of the distribution of the number of electors across different years in India.

Overall, the code generates a grid of distribution plots using the FacetGrid functionality of Seaborn. Each subplot represents the distribution of the 'year' variable for different combinations of 'ac_type' and 'cand_sex' values. The visual settings, such as font scale and figure size, are customized for better readability and presentation.

The above code creates a scatter plot with the appropriate font size, title, axis labels, and marker colors based on the "cand_sex" column of the ind_elec_state_df DataFrame. The x-axis labels are rotated by 45 degrees for better visibility.

The above heatmap shows the missing values in the Indian state election data.
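A missing-value heatmap is commonly built from `isnull()`. The sketch below follows that idiom on made-up rows and also prints the count of missing values per column. Whether the original cell used exactly this call is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up rows with two deliberately missing entries.
df = pd.DataFrame({
    "st_name": ["Bihar", "Goa", "Uttar Pradesh", "Bihar"],
    "cand_sex": ["M", None, "F", "M"],
    "totvotpoll": [12000.0, 4000.0, np.nan, 15000.0],
})

missing_mask = df.isnull()               # True where a value is missing
missing_per_column = missing_mask.sum()  # number of missing values per column
print(missing_per_column)

fig, ax = plt.subplots(figsize=(6, 4))
# Cast to 0/1 so each missing cell shows up as a contrasting tile.
sns.heatmap(missing_mask.astype(int), cbar=False, ax=ax)
```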

The above bar graph shows the number of electors by constituency type (ac_type), with the y-axis scaled by 1e6. It makes it easy to read off the number of electors in each category, such as general, SC, and ST.

The above graphs show a count plot and a pie chart of candidate sex, from which we can easily read the counts of male, female, and transgender candidates.

The purpose of this visualization is to identify patterns and relationships among the variables in the ind_elec_state_df DataFrame based on their correlation. It can help in understanding the interdependencies and similarities between different columns and can provide insights into the data structure.
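The numbers behind a correlation-based plot come from a correlation matrix of the numeric columns. The sketch below computes one on made-up rows. It shows only the underlying table, not the notebook's actual figure.

```python
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame({
    "year": [1980, 1985, 1984, 1980, 1990],
    "electors": [60000, 70000, 12000, 90000, 45000],
    "totvotpoll": [12000, 15000, 4000, 30000, 10000],
    "ac_type": ["GEN", "SC", "GEN", "ST", "GEN"],
})

# Keep only numeric columns, then compute pairwise Pearson correlations.
corr = df.select_dtypes(include="number").corr()
print(corr.round(2))
```

The resulting square table can be passed to a seaborn heatmap or cluster map for display.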

Q/A Exploration

Q) How many candidates in total took part in Indian elections from 1978 to 2011?

Q) What is the total number of unique candidates in the list who retained their seat in the next election?

Q) How many seats are there in total in the election?

Q) What is the average vote in Bihar?
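A question like this can be answered by filtering on the state and averaging the votes. The sketch uses made-up rows and assumes the state column is called `st_name`.

```python
import pandas as pd

# Made-up rows for illustration; 'st_name' is an assumed column name.
df = pd.DataFrame({
    "st_name": ["Bihar", "Bihar", "Goa", "Uttar Pradesh"],
    "totvotpoll": [12000, 15000, 4000, 30000],
})

# Select the Bihar rows, then take the mean of total votes polled.
bihar_mean = df.loc[df["st_name"] == "Bihar", "totvotpoll"].mean()
print(bihar_mean)
```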

Q) What is the number of records with electors > 13500 and with electors < 13500?
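Counting rows on either side of a threshold can be done by summing boolean masks, as in this sketch on made-up values. Rows equal to the threshold fall into neither group.

```python
import pandas as pd

# Made-up values for illustration only.
df = pd.DataFrame({"electors": [60000, 70000, 12000, 90000, 13500]})

THRESHOLD = 13500
above = int((df["electors"] > THRESHOLD).sum())  # rows strictly above the threshold
below = int((df["electors"] < THRESHOLD).sum())  # rows strictly below the threshold
print(above, below)
```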

Q) What are the details of the candidates with the maximum and minimum electors?
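The full rows for the extremes can be looked up with `idxmax` and `idxmin`. This is a sketch on made-up rows; if several rows tie, only the first is returned.

```python
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame({
    "ac_name": ["Alpha", "Beta", "Gamma", "Delta"],
    "cand_sex": ["M", "F", "M", "M"],
    "electors": [60000, 70000, 12000, 90000],
})

# idxmax/idxmin return the index label of the extreme value; .loc fetches that row.
max_row = df.loc[df["electors"].idxmax()]
min_row = df.loc[df["electors"].idxmin()]
print(max_row)
print(min_row)
```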

Q) What is the total number of candidates in that area?

Q) What are the maximum and minimum total votes polled by a candidate?

Q) What is the total count of male and female candidates?
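Counts per category come directly from `value_counts`. In this sketch the codes 'M' and 'F' in `cand_sex` are assumed, and the rows are made up.

```python
import pandas as pd

# Made-up rows; the codes 'M' and 'F' are assumptions about how cand_sex is stored.
df = pd.DataFrame({"cand_sex": ["M", "F", "M", "M", "F"]})

# Number of rows for each distinct value of cand_sex.
sex_counts = df["cand_sex"].value_counts()
print(sex_counts)
```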

Q) What is the total count of candidates grouped by ac_name?
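A grouped count of this kind can be sketched with `groupby` and `size`. The rows below are made up for illustration.

```python
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame({
    "ac_name": ["Alpha", "Alpha", "Beta", "Gamma", "Alpha"],
    "cand_sex": ["M", "F", "M", "M", "M"],
})

# Number of candidate rows per constituency name, largest first.
candidates_per_ac = df.groupby("ac_name").size().sort_values(ascending=False)
print(candidates_per_ac)
```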

Q) Give an analysis of the state of Bihar.

Q) What are the data types for the state of Bihar?

Q) What is the sum of total votes polled?

Q) What is the type of ind_elec_bihar_df.partyabbre?

Q) What are the data types for Bihar?

Q) Give information about the state of Bihar.

Q) What is the distribution of candidate sex across the whole country?

Q) What is the count of each candidate gender, and the total number of candidates?

Inferences and Conclusion

Based on the Indian election database from 1978 to 2015, several inferences and conclusions can be drawn.

1) Voting Patterns and Trends: Analyzing the historical data can reveal long-term voting patterns and trends in Indian elections. This includes understanding shifts in voter preferences, party dominance in specific regions or time periods, and the impact of socio-political factors on electoral outcomes.

2) Party Performance: The database can provide insights into the performance of political parties over the years. It can help identify parties that have consistently performed well or experienced fluctuations in their electoral fortunes. This analysis can shed light on the factors influencing party success or failure.

3) Regional Dynamics: By examining the data, it is possible to identify regional dynamics and their impact on electoral outcomes. This includes understanding voting behavior in different states, the rise and fall of regional parties, and the influence of regional issues on voter decisions.

4) Impact of Demographics: The database can be used to study the impact of demographics on elections. By analyzing variables such as age, gender, caste, and religion, researchers can identify demographic groups that have significant influence on voting patterns and outcomes.

5) Electoral Reforms: Examining the database can highlight the need for electoral reforms. It can reveal issues such as voter turnout disparities, irregularities in the electoral process, or underrepresentation of certain communities. These findings can support calls for reforms aimed at improving the fairness and inclusiveness of the electoral system.

6) Political Polarization: The data can provide insights into political polarization and ideological shifts over time. By examining party manifestos, candidate profiles, and voter behavior, researchers can observe changes in political ideologies and the level of polarization among different sections of the electorate.

7) Election Campaign Strategies: The database can offer insights into the effectiveness of various election campaign strategies employed by political parties. Analyzing campaign expenditures, media coverage, and candidate profiles can provide valuable information on what resonates with voters and influences their decision-making.

References and Future Work

Reference: Kaggle

1) Historical Analysis: The database can be used to conduct in-depth historical analysis of electoral patterns, trends, and shifts in Indian politics over the years. Researchers and political scientists can explore various aspects such as voter turnout, party performance, candidate profiles, and regional dynamics, providing valuable insights into the evolution of democracy in India.

2) Predictive Modeling: Using the historical data as a foundation, predictive modeling techniques can be employed to forecast election outcomes in future elections. By analyzing historical patterns, demographics, and socio-political factors, data scientists can develop predictive models that provide estimates or probabilities of party performance, allowing for better strategic planning and decision-making.

3) Policy Impact Assessment: The database can facilitate the assessment of policy impact on electoral outcomes. By correlating government policies and electoral results, researchers can gauge the influence of specific policies or governance decisions on voter behavior. This analysis can contribute to evidence-based policymaking and help politicians understand the potential electoral implications of policy choices.

4) Sentiment Analysis and Opinion Mining: With the availability of vast textual data from election campaigns, speeches, and media coverage, sentiment analysis and opinion mining techniques can be employed to gauge public sentiment towards political parties, candidates, and issues. This can provide insights into voter preferences, public perception, and the effectiveness of different political communication strategies.

5) Comparative Studies: The database can be utilized for comparative studies across different elections and regions within India. By examining similarities and differences in voting patterns, party performances, and electoral dynamics, researchers can identify factors that contribute to variations in outcomes, enabling a better understanding of regional and cultural influences on Indian elections.